SEGAN: Speech Enhancement Generative Adversarial Network
Current speech enhancement techniques operate in the spectral domain and/or
exploit some higher-level feature. The majority of them tackle a limited number
of noise conditions and rely on first-order statistics. To circumvent these
issues, deep networks are being increasingly used, thanks to their ability to
learn complex functions from large example sets. In this work, we propose the
use of generative adversarial networks for speech enhancement. In contrast to
current techniques, we operate at the waveform level, training the model
end-to-end, and incorporate 28 speakers and 40 different noise conditions into
the same model, such that model parameters are shared across them. We evaluate
the proposed model using an independent, unseen test set with two speakers and
20 alternative noise conditions. The enhanced samples confirm the viability of
the proposed model, and both objective and subjective evaluations confirm its
effectiveness. With that, we open the exploration of generative
architectures for speech enhancement, which may progressively incorporate
further speech-centric design choices to improve their performance.
Comment: 5 pages, 4 figures, accepted in INTERSPEECH 201
When the state of the art is ahead of the state of understanding: unintuitive properties of deep neural networks
Deep learning is an undeniably hot topic, not only within academia and industry, but also among society and the media. The reasons for its popularity are manifold: unprecedented availability of data and computing power, some innovative methodologies, minor but significant technical tricks, etc. Interestingly, however, the current success and practice of deep learning seem to be uncorrelated with its theoretical, more formal understanding. As a result, deep learning's state of the art presents a number of unintuitive properties or situations. In this note, I highlight some of these unintuitive properties, trying to show relevant recent work, and expose the need to gain insight into them, either by formal or more empirical means.
- …